Introduction


The overall goal of this project is to develop new Bayesian
procedures for mental testing. A typical test, which is studied
here, consists of k test items administered to n examinees. The
data consist of an n x k matrix of binary responses indicating
which of the k items are scored correctly and which incorrectly
by each of the n examinees.


The statistical procedures are based on the assumption that
there is a model which specifies the probability of a correct
response to each item as a function of a unidimensional
ability. Such functions are assumed to belong to certain
families such as the two-parameter logistic (2PL) or
three-parameter logistic (3PL) curves. These curves are identified by
parameters called item parameters.
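
For concreteness, the two families of curves can be sketched as follows. This is a minimal illustration rather than the parameterization used in the cited papers: the names a (discrimination), b (difficulty) and c (lower asymptote) are conventional labels assumed here, and any scaling constant applied to the logit is omitted.

```python
import math


def irf_2pl(theta, a, b):
    """Two-parameter logistic curve: P(correct response | ability theta).

    a and b are assumed names for the slope and location item parameters.
    """
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def irf_3pl(theta, a, b, c):
    """Three-parameter logistic curve: the 2PL curve with lower asymptote c."""
    return c + (1.0 - c) * irf_2pl(theta, a, b)


if __name__ == "__main__":
    # Probability of a correct response across a few ability values.
    for theta in (-2.0, 0.0, 2.0):
        print(theta, irf_2pl(theta, 1.0, 0.0), irf_3pl(theta, 1.0, 0.0, 0.2))
```

Setting c to zero in the 3PL curve recovers the 2PL curve, which is why the two families are usually treated together.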


When these models are used for testing, a set of items is
initially calibrated using a moderately large value for n (the
sample size). The calibration consists of estimating the item
parameters. The calibrated curves are then used to score the
abilities of new examinees.



It is standard practice to ignore the uncertainty in the
item parameters once the calibration is complete and to estimate
abilities assuming that the item parameters are known. This
practice can lead to serious inferential errors in the measurement
of abilities. In particular, an interval estimate of an ability can
be too narrow, giving a false impression of the accuracy of the
estimate.


The  sequential  nature  of  first  calibrating  and  then  scoring 
makes  the  Bayesian  approach  particularly  appropriate.  According 
to  this  approach,  an  analysis  is  made  of  the  uncertainties  in 
the  estimated  items  at  the  calibration  phase.  This  uncertainty 
is  then  taken  into  account  when  abilities  are  measured.  The 
uncertainty  in  the  measured  ability  is  not  only  due  to  the 
randomness  of  responses  from  individuals  with  the  same  ability, 
but  also  due  to  the  uncertainty  in  the  calibrated  items. 

The  Bayesian  paradigm  can  be  extended  to  on-line 
calibration,  where  new  items  are  introduced  with  items  which  have 
already  been  calibrated.  In  this  situation  the  uncertainties  of 
abilities  based  on  the  calibrated  items  are  incorporated  into  the 
uncertainties of the new items. Again, the standard
practice is to ignore the uncertainties in the abilities of the
individuals used for the calibration of the new items.

In order to develop this general Bayesian approach to mental
testing, the research was divided into the following four topics.
The results for each are outlined below.

I.  Development of a general Bayesian framework for item response
analysis.

II.  Estimation  of  item  parameters. 

III.  Estimation  of  abilities. 

IV.  On-line  calibration. 


I.  Bayesian  Framework 

The general framework for Bayesian item response theory has
been described in Tsutakawa and Lin (Psychometrika, 1986). Given
that the item response curves belong to a certain parametric
family, a prior distribution for the item parameters is assumed.
The joint likelihood function of ability and item parameters is
based on the assumption of local independence. The abilities are
assumed iid N(0,1). The marginal likelihood function is then the
average of the joint likelihood function weighted by the N(0,1)
ability distribution. The marginal likelihood function is then
multiplied by the prior to get the (unnormalized) posterior for the
item parameters. The marginal posterior for the ability parameter
can be similarly expressed but is not easy to work with due to the
multiple integrals involved.
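
The construction just described can be sketched numerically. The code below is an illustration under stated assumptions, not the procedure of Tsutakawa and Lin (1986): it assumes 2PL curves, replaces the integral over ability by a weighted sum on an equally spaced grid, and takes the log prior as a user-supplied function. All function names are hypothetical.

```python
import math


def p_correct(theta, a, b):
    """2PL probability of a correct response (assumed model for this sketch)."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def joint_likelihood(responses, theta, items):
    """Likelihood of one examinee's 0/1 responses at ability theta.

    Local independence: the item responses multiply given theta.
    """
    like = 1.0
    for y, (a, b) in zip(responses, items):
        p = p_correct(theta, a, b)
        like *= p if y == 1 else 1.0 - p
    return like


def normal_grid(num_points=81, lo=-6.0, hi=6.0):
    """Equally spaced ability nodes with normalized N(0,1) weights."""
    step = (hi - lo) / (num_points - 1)
    nodes = [lo + i * step for i in range(num_points)]
    dens = [math.exp(-0.5 * t * t) for t in nodes]
    total = sum(dens)
    return nodes, [d / total for d in dens]


def log_marginal_likelihood(data, items):
    """Sum over examinees of the log of the N(0,1)-weighted average likelihood."""
    nodes, weights = normal_grid()
    total = 0.0
    for responses in data:
        avg = sum(w * joint_likelihood(responses, t, items)
                  for t, w in zip(nodes, weights))
        total += math.log(avg)
    return total


def log_unnormalized_posterior(data, items, log_prior):
    """Log marginal likelihood plus log prior of the item parameters."""
    return log_marginal_likelihood(data, items) + log_prior(items)
```

Because abilities are averaged out examinee by examinee, the item-parameter posterior involves only one-dimensional integrals, whereas the marginal posterior of a single ability requires integrating over all item parameters.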


II.  Estimation  of  item  parameters 


The  general  approach  developed  for  item  parameter  estimation 
is  to  use  as  point  estimate  the  posterior  mode  and  as  measure  of 
uncertainty  the  posterior  covariance  matrix.  The  use  of  the  EM 
algorithm  for  computing  the  posterior  mode  is  described  in 
Tsutakawa  and  Lin  (1986).  A  novel  feature  of  this  paper  is  the 
use  of  the  ordered  bivariate  beta  to  form  a  prior  distribution 
for the item parameters in 2PL. This paper proposes the use of
the  inverse  posterior  information  matrix  to  approximate  the 
posterior  covariance  matrix.  It  also  illustrates  the  relative 
closeness  of  estimated  values  in  repeated  samples,  when  compared 
to  standard  methods  such  as  LOGIST  (Wingersky,  Barton  &  Lord 
1982). 
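
The idea of pairing a posterior mode with an inverse-information estimate of its variance can be illustrated on a deliberately reduced problem. The sketch below assumes a single Rasch item with difficulty b, N(0,1) abilities, and a normal prior on b; it uses Newton's method with numerical derivatives rather than the EM algorithm, and it does not use the ordered bivariate beta prior of the cited paper. All names are hypothetical.

```python
import math


def normal_grid(num_points=81, lo=-6.0, hi=6.0):
    """Equally spaced ability nodes with normalized N(0,1) weights."""
    step = (hi - lo) / (num_points - 1)
    nodes = [lo + i * step for i in range(num_points)]
    dens = [math.exp(-0.5 * t * t) for t in nodes]
    total = sum(dens)
    return nodes, [d / total for d in dens]


def marginal_p_correct(b, nodes, weights):
    """Probability of a correct response averaged over N(0,1) abilities."""
    return sum(w / (1.0 + math.exp(-(t - b))) for t, w in zip(nodes, weights))


def log_posterior(b, n_correct, n_total, nodes, weights, prior_sd=2.0):
    """Log marginal likelihood of the counts plus an assumed N(0, prior_sd^2) log prior."""
    p = marginal_p_correct(b, nodes, weights)
    log_like = n_correct * math.log(p) + (n_total - n_correct) * math.log(1.0 - p)
    return log_like - 0.5 * (b / prior_sd) ** 2


def posterior_mode_and_variance(n_correct, n_total, h=1e-4, tol=1e-8, max_iter=50):
    """Newton search for the mode; variance approximated by inverse information."""
    nodes, weights = normal_grid()

    def f(b):
        return log_posterior(b, n_correct, n_total, nodes, weights)

    def second_derivative(b):
        return (f(b + h) - 2.0 * f(b) + f(b - h)) / (h * h)

    b = 0.0
    for _ in range(max_iter):
        grad = (f(b + h) - f(b - h)) / (2.0 * h)
        step = grad / second_derivative(b)
        b -= step
        if abs(step) < tol:
            break
    information = -second_derivative(b)  # curvature of the log posterior at the mode
    return b, 1.0 / information


if __name__ == "__main__":
    mode, variance = posterior_mode_and_variance(n_correct=120, n_total=400)
    print(mode, variance)
```

In the multiparameter case the second derivative becomes a matrix of second partial derivatives, and its negative inverse at the mode plays the role of the approximate posterior covariance matrix.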

In Rigdon and Tsutakawa (JES, 1987) an empirical Bayes
procedure is developed for the case in which both ability and
item  parameters  are  sampled  from  population  distributions  with 
unknown  hyperparameters.  Here  the  EM  algorithm  (Dempster,  Laird, 
&  Rubin,  JRSS-B  1977)  is  modified  to  simultaneously  estimate  the 
hyperparameters.  Simulations  are  used  to  show  the  robustness  of 
this  approach  relative  to  marginal  maximum  likelihood  in  the 
case  of  the  Rasch  model. 


The use of the Dirichlet distribution to form a prior
distribution for item parameters in 3PL is studied in Tsutakawa
(TR143, 1988). The conventional Bayesian approach assumes prior
independence of parameters within items. This paper suggests a
simple device to represent the prior dependence among parameters
within items. The emphasis here is on looking at curves rather
than parameters. Bayesian modal estimates are compared with
LOGIST (Wingersky, Barton, and Lord, 1982) and marginal maximum
likelihood. The robustness of the Bayesian estimate relative to
weights placed on the prior is also illustrated. One notable
feature of the Bayesian method is that there are far fewer
outliers with unreasonable values.


III.  Ability estimation

Bayesian  approximations  to  the  posterior  mean  and  variance 
of  ability  are  proposed  and  illustrated  for  2PL  in  Tsutakawa  and 
Soltys  (JES,  1988).  The  standard  empirical  Bayes  approximations 
are  posterior  moments  conditional  on  assuming  the  unknown  item 
parameters  to  equal  those  estimated  at  the  calibration  phase. 
The  new  approximation  modifies  this  by  adding  terms  representing, 
and  correcting  for,  the  uncertainties  of  the  calibrated  item 
parameters. It is a special case of Lindley’s (1980)
approximation when the third partial derivatives of the
log posterior vanish. The new approximation shows that the
empirical Bayes approximation consistently underestimates the
posterior variance. Other approximations, including those by
Leonard (1982) and Tierney and Kadane (1986), have also been
examined and found to require an excessive amount of computing;
they are therefore not suitable for routine use in ability estimation.
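
Why conditioning on estimated item parameters tends to understate the posterior variance can be seen from the law of total variance: the variance of ability is the average conditional variance given the item parameters plus the variance of the conditional means. The sketch below is a Monte Carlo illustration of that decomposition, not the approximation of Tsutakawa and Soltys. It assumes 2PL items, independent normal approximations to the item-parameter posterior with hypothetical standard errors, and it ignores the small contribution of the new examinee's responses to that posterior.

```python
import math
import random


def normal_grid(num_points=61, lo=-6.0, hi=6.0):
    """Equally spaced ability nodes with normalized N(0,1) weights."""
    step = (hi - lo) / (num_points - 1)
    nodes = [lo + i * step for i in range(num_points)]
    dens = [math.exp(-0.5 * t * t) for t in nodes]
    total = sum(dens)
    return nodes, [d / total for d in dens]


def ability_moments(responses, items, nodes, weights):
    """Posterior mean and variance of ability with item parameters held fixed."""
    post = []
    for t, w in zip(nodes, weights):
        value = w
        for y, (a, b) in zip(responses, items):
            p = 1.0 / (1.0 + math.exp(-a * (t - b)))
            value *= p if y == 1 else 1.0 - p
        post.append(value)
    z = sum(post)
    mean = sum(t * q for t, q in zip(nodes, post)) / z
    var = sum((t - mean) ** 2 * q for t, q in zip(nodes, post)) / z
    return mean, var


def moments_with_item_uncertainty(responses, estimates, std_errors, draws=300, seed=1):
    """Average over item-parameter draws; returns (mean, total variance, within variance)."""
    rng = random.Random(seed)
    nodes, weights = normal_grid()
    means, variances = [], []
    for _ in range(draws):
        # Slopes are truncated at a small positive value to keep curves increasing.
        items = [(max(0.05, rng.gauss(a, sa)), rng.gauss(b, sb))
                 for (a, b), (sa, sb) in zip(estimates, std_errors)]
        m, v = ability_moments(responses, items, nodes, weights)
        means.append(m)
        variances.append(v)
    grand_mean = sum(means) / draws
    within = sum(variances) / draws
    between = sum((m - grand_mean) ** 2 for m in means) / draws
    return grand_mean, within + between, within
```

The between-draw term is the part that is lost when the item parameters are treated as known. It is nonnegative by construction, so the total variance can never fall below the average conditional variance.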

The Bayesian approximation was then extended to 3PL in
Tsutakawa and Johnson (TR147, 1988). This paper demonstrates
that maximum likelihood and empirical Bayes, both of which
replace unknown item parameters by their estimates, grossly
underestimate the variance of the ability parameters. The
numerical examples, on which most of the conclusions rest,
are based on a sample of n = 400. Although the discrepancies
between the procedures should decrease as n increases, there is
some feeling that even at n = 1600 the differences might not be
negligible.


IV.  On-line calibration

Work  on  this  topic  remains  incomplete  due  to  the  delay 
encountered  in  developing  computer  programs  for  Bayesian  ability 
estimation  under  3PL.  The  delay  was  due  to  untimely  personnel 
changes,  which  required  finding  and  training  a  new  computer 
programmer  each  time  a  person  left. 

Summary


The development of Bayesian item response theory requires a
considerable amount of computation and new techniques for
approximating posterior distributions. This research
demonstrates that computational problems (though far from solved)
can be dealt with by careful use of asymptotic approximations.
It also demonstrates that reasonable prior distributions can be
formulated in spite of the complexities of IRT models. More
importantly, it shows the feasibility of developing a
comprehensive and complete theory which can be adapted to
large-scale testing environments.
1988/07/06 


University of Missouri-Columbia/Tsutakawa

